Improving machine simultaneous interpretation by punctuation recovery
CHEN Yuna, SHI Xiaodong
Journal of Computer Applications, 2020, 40(4): 972-977. DOI: 10.11772/j.issn.1001-9081.2019101711
Abstract
In a pipelined Machine Simultaneous Interpretation (MSI) system, semantic incompleteness occurs when Automatic Speech Recognition (ASR) outputs are fed directly into Neural Machine Translation (NMT). To address this problem, a model based on Bidirectional Encoder Representations from Transformers (BERT) and Focal Loss is proposed. First, several segments generated by the ASR system are cached and concatenated into a string. Then, a BERT-based sequence labeling model recovers the punctuation of the string, with Focal Loss used as the training loss function to alleviate the class imbalance caused by unpunctuated samples outnumbering punctuated ones. Finally, the punctuation-restored string is input into NMT. Experimental results on English-German and Chinese-English translation show that, in terms of translation quality, MSI using the proposed punctuation recovery model improves by 8.19 BLEU and 4.24 BLEU respectively over MSI that feeds ASR outputs directly into NMT, and by 2.28 BLEU and 3.66 BLEU respectively over MSI using a punctuation recovery model based on a bidirectional recurrent neural network with an attention mechanism. Therefore, the proposed model can be effectively applied to MSI.
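
To illustrate why Focal Loss helps with the class imbalance mentioned above, the following is a minimal sketch of the standard per-token focal loss, FL(p_t) = -(1 - p_t)^gamma * log(p_t), written in plain Python. It is not the authors' implementation: the label set (`O`, `COMMA`, `PERIOD`, `QUESTION`), the value `gamma=2.0`, and the function names are assumptions made for this example, since the abstract does not specify them.

```python
import math
from typing import List, Sequence

# Hypothetical label set: "O" marks a token followed by no punctuation.
LABELS = ["O", "COMMA", "PERIOD", "QUESTION"]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one token's class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Plain cross-entropy for a single token."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits: Sequence[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for a single token.

    The factor (1 - p_t) ** gamma shrinks the loss of tokens the model
    already classifies confidently (typically the abundant "O" tokens),
    so rarer punctuation labels contribute relatively more to training.
    gamma=2.0 is an assumed default, not a value taken from the paper.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


if __name__ == "__main__":
    easy = [4.0, 0.0, 0.0, 0.0]  # confident, correct "O" prediction
    hard = [2.0, 0.0, 0.0, 0.0]  # true label is "COMMA" but the model prefers "O"
    print("easy:", cross_entropy(easy, 0), focal_loss(easy, 0))
    print("hard:", cross_entropy(hard, 1), focal_loss(hard, 1))
```

With `gamma=0` the focal loss reduces to ordinary cross-entropy. Larger values of `gamma` down-weight well-classified tokens more strongly.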